TRAP-Tagging: a novel method for the identification and purification of

RNA-protein complexes. 



Abstract 

With the recent completion of several genome sequencing projects, scientists have embarked on
comprehensive attempts to unravel all of the interactions amongst their gene products. While
many proteomics efforts are well under way, little attention has been paid to the RNA products
of these genes. The variety and scope of roles that RNA molecules play in biological processes are
only now beginning to be appreciated. Consequently, "ribonomics" will likely be the next focus
of genomic efforts. Unfortunately, conventional methods for the isolation and identification of
specific RNA-protein complexes are plagued by a number of problems not encountered in
genomics or proteomics. Here we describe a method that circumvents these problems. The
TRAP (Tandem RNA Affinity Purification) tag is a dual RNA tagging system that facilitates
gentle purification of RNA molecules along with the proteins, RNAs and other small molecules
specifically associated with them.

Background

In addition to serving as essential intermediates between genes and proteins, RNA molecules
also serve structural and regulatory roles in a rapidly growing list of biological processes. These
include all of the basic steps of mRNA processing such as splicing, nuclear to cytoplasm
transport, translation, and decay (Doudna and Rath, 2002; Erdmann et al., 2001; Pesole et al.,
2001). Other known functions include the regulation of transcript initiation (Berkhout et al.,
1989), dosage compensation (Bell et al., 1988; Lee and Jaenisch, 1997; Meller et al., 2000;
Salido et al., 1992), telomere maintenance (Le et al., 2000) and DNA replication. Importantly,
the genomes of many viruses are encoded as RNA rather than DNA, and much of their infective
cycle is controlled by RNA biochemistry (Berkhout et al., 1989). Clearly, these molecules and
processes are crucial for cell and pathogen viability, and are excellent targets for drug
intervention.

A comprehensive dissection of the protein complexes that incorporate specific RNAs will help
elucidate their functions, many of which are likely to be novel. However, the methodologies
currently employed to identify RNA-associated molecules are not ideally suited for such an
endeavor. For example, RNA binding proteins generally do not have the same specificity as
DNA binding proteins. Consequently, techniques that identify individual RNA-protein
interactions frequently isolate proteins that are irrelevant to the processes being studied. Indeed,
there is increasing evidence that many high affinity RNA/protein interactions require multiple
contacts between several proteins and their cognate elements within the RNA molecule
(Chartrand et al., 2001). An additional consequence of this complexity is that proteins being
sought in extracts may already be bound in stable RNP complexes, allowing other abundant and
relatively non-specific RNA binding proteins to bind RNA probes.

We have invented a method capable of isolating specific RNA-protein complexes formed in
vivo.

Summary 

This invention provides an isolated DNA construct comprising a transcription cassette, which
comprises a promoter sequence, a bait sequence operably linked to the promoter, a
transcriptional termination sequence which comprises a stop signal for RNA polymerase and a
polyadenylation signal for polyadenylase, and at least two tag sequences. In one embodiment,
the isolated DNA construct further comprises at least three insulator sequences. In another
embodiment the isolated DNA construct comprises at least one streptavidin binding sequence
[SEQ ID NO:1, SEQ ID NO:2] and at least one MS2 coat protein binding sequence [SEQ ID
NO:4, SEQ ID NO:6, SEQ ID NO:7]. In yet another embodiment, the isolated DNA construct
comprises at least one tag sequence which hybridizes to the streptavidin binding sequence [SEQ
ID NO:2] and at least one tag sequence which hybridizes to the MS2 coat protein sequence [SEQ
ID NO:4] under high stringency hybridization conditions.

The invention also provides an isolated DNA construct comprising a transcription cassette,
which construct comprises a promoter sequence, a bait sequence operably linked to the
promoter, a transcriptional termination sequence, which comprises a stop signal for RNA
polymerase and a polyadenylation signal for polyadenylase; and at least three tag sequences. In
one embodiment the isolated DNA construct further comprises at least four insulator sequences.
In another embodiment the isolated DNA construct comprises at least one streptavidin binding
sequence [SEQ ID NO:2] and at least two MS2 coat protein binding sequences [SEQ ID NO:7].
In yet another embodiment the isolated DNA construct comprises at least one tag sequence which
hybridizes to the streptavidin binding sequence [SEQ ID NO:2] and at least two tag sequences
which hybridize to the MS2 coat protein sequence [SEQ ID NO:7] under high stringency
hybridization conditions.

The present invention relates to a vector comprising an isolated DNA construct comprising a 
transcription cassette, which comprises a promoter sequence, a bait sequence operably linked to 
the promoter, a transcriptional termination sequence which comprises a stop signal for RNA 
polymerase and a polyadenylation signal for polyadenylase, and at least two tag sequences. In
one embodiment, the isolated DNA construct further comprises at least three insulator sequences. 
In another embodiment the isolated DNA construct comprises at least one streptavidin binding 
sequence [SEQ ID NO:2] and at least one MS2 coat protein binding sequence [SEQ ID NO:4]. 
In yet another embodiment, the isolated DNA construct comprises at least one tag sequence 
which hybridizes to the streptavidin binding sequence [SEQ ID NO:2] and at least one tag 
sequence which hybridizes to the MS2 coat protein sequence [SEQ ID NO:4] under high
stringency hybridization conditions. 

The present invention also relates to a vector comprising an isolated DNA construct comprising a 
transcription cassette, which construct comprises, a promoter sequence, a bait sequence operably 
linked to the promoter, a transcriptional termination sequence, which comprises a stop signal for 
RNA polymerase and a polyadenylation signal for polyadenylase; and at least three tag 
sequences. In one embodiment the isolated DNA construct further comprises at least four 
insulator sequences. In another embodiment the isolated DNA construct comprises at least one
streptavidin binding sequence [SEQ ID NO:2] and at least two MS2 coat protein binding
sequences [SEQ ID NO:7]. In yet another embodiment the isolated DNA construct comprises at least one
tag sequence which hybridizes to the streptavidin binding sequence [SEQ ID NO:2] and at least
two tag sequences which hybridize to the MS2 coat protein sequence [SEQ ID NO:7] under high 
stringency hybridization conditions. 



CA 02407825 2002-10-11 



The invention further provides a host cell transformed with a vector comprising an isolated DNA
construct comprising a transcription cassette, which comprises a promoter sequence, a bait
sequence operably linked to the promoter, a transcriptional termination sequence which
comprises a stop signal for RNA polymerase and a polyadenylation signal for polyadenylase, and
at least two tag sequences. In one embodiment, the isolated DNA construct further comprises at
least three insulator sequences. In another embodiment the isolated DNA construct comprises at
least one streptavidin binding sequence [SEQ ID NO:2] and at least one MS2 coat protein
binding sequence [SEQ ID NO:4]. In yet another embodiment, the isolated DNA construct
comprises at least one tag sequence which hybridizes to the streptavidin binding sequence [SEQ
ID NO:1] and at least one tag sequence which hybridizes to the MS2 coat protein sequence [SEQ
ID NO:4] under high stringency hybridization conditions.

The invention also provides for a host cell transformed with a vector comprising an isolated
DNA construct comprising a transcription cassette,
which construct comprises a promoter sequence, a bait sequence operably linked to the
promoter, a transcriptional termination sequence, which comprises a stop signal for RNA
polymerase and a polyadenylation signal for polyadenylase; and at least three tag sequences. In
one embodiment the isolated DNA construct further comprises at least four insulator sequences.
In another embodiment the isolated DNA construct comprises at least one streptavidin binding
sequence [SEQ ID NO:2] and at least two MS2 coat protein binding sequences [SEQ ID NO:7].
In yet another embodiment the isolated DNA construct comprises at least one tag sequence which
hybridizes to the streptavidin binding sequence [SEQ ID NO:2] and at least two tag sequences
which hybridize to the MS2 coat protein sequence [SEQ ID NO:7] under high stringency
hybridization conditions.

Another aspect of the invention is an RNA fusion molecule comprising a target RNA sequence
and at least two RNA tags, wherein at least one of the RNA tags interacts with a ligand in a
reversible fashion. In one embodiment, the RNA fusion molecule further comprises at least three
insulators. In another embodiment the RNA fusion molecule comprises at least one streptavidin
binding tag [SEQ ID NO:3] and at least one MS2 coat protein binding tag [SEQ ID NO:5].

The current invention also relates to an RNA fusion molecule comprising a target RNA sequence
and at least three RNA tags, wherein at least two of the RNA tags interact with a ligand in a
reversible fashion. In one embodiment, the RNA fusion molecule further comprises at least four
insulators. In another embodiment, the RNA fusion molecule comprises at least one streptavidin
binding tag [SEQ ID NO:3] and at least two MS2 coat protein binding tags [SEQ ID NO:8].

The invention provides a method for isolating an RNA-protein complex formed in vivo
comprising, expressing in a eukaryotic cell an RNA fusion molecule of the current invention,
generating a whole cell extract, passing the extract over a first solid support comprising
streptavidin protein, eluting a first eluate with the addition of biotin, collecting the first eluate,
passing the first eluate over a second solid support comprising MS2 coat protein, eluting a
second eluate with the addition of a reagent selected from the group consisting of glutathione,
RNAse or a denaturant, and collecting the second eluate, wherein the second eluate contains the
isolated RNA-protein complex.

The current invention provides a method of identifying a protein in an RNA-protein complex
comprising isolating an RNA-protein complex formed in vivo comprising, expressing in a
eukaryotic cell an RNA fusion molecule of the current invention, generating a whole cell extract,
passing the extract over a first solid support comprising streptavidin protein, eluting a first eluate
with the addition of biotin, collecting the first eluate, passing the first eluate over a second solid
support comprising MS2 coat protein, eluting a second eluate with the addition of a reagent
selected from the group consisting of glutathione, RNAse or a denaturant, and collecting the
second eluate, wherein the second eluate contains the isolated RNA-protein complex and
identifying the protein in the RNA-protein complex.

The invention also provides for a protein identified by isolating an RNA-protein complex formed 
in vivo comprising expressing in a eukaryotic cell an RNA fusion molecule of the current 
invention, generating a whole cell extract, passing the extract over a first solid support 
comprising streptavidin protein, eluting a first eluate with the addition of biotin, collecting the 
first eluate, passing the first eluate over a second solid support comprising MS2 coat protein, 
eluting a second eluate with the addition of a reagent selected from the group consisting of
glutathione, RNAse or a denaturant, and collecting the second eluate, wherein the second eluate
contains the isolated RNA-protein complex and identifying the protein in the RNA-protein
complex.

Another aspect of the current invention is a method for isolating an RNA-protein complex
formed in vitro comprising, (a) expressing an RNA fusion molecule of the current invention in
vitro, (b) obtaining a whole cell extract, (c) passing the whole cell extract over a first solid
support comprising streptavidin protein, (d) eluting a first eluate with the addition of biotin, (e)
collecting the first eluate, (f) passing the first eluate over a second solid support comprising MS2
coat protein, (g) eluting a second eluate with the addition of a reagent selected from the group
consisting of glutathione, RNAse or a denaturant, and (h) collecting the second eluate, wherein
the second eluate contains the isolated RNA-protein complex. In one embodiment, steps (c) to
(e) are repeated.

The current invention provides a method of identifying a protein in an RNA-protein complex
comprising isolating an RNA-protein complex formed in vitro comprising (a) expressing an RNA
fusion molecule of the current invention in vitro, (b) obtaining a whole cell extract, (c) passing
the whole cell extract over a first solid support comprising streptavidin protein, (d) eluting a first
eluate with the addition of biotin, (e) collecting the first eluate, (f) passing the first eluate over a
second solid support comprising MS2 coat protein, (g) eluting a second eluate with the addition of
a reagent selected from the group consisting of glutathione, RNAse or a denaturant, and (h)
collecting the second eluate, wherein the second eluate contains the isolated RNA-protein
complex and identifying the protein in the RNA-protein complex. In one embodiment, steps (c)
to (e) are repeated.

The invention also provides for a protein identified by isolating an RNA-protein complex formed
in vitro comprising, (a) expressing an RNA fusion molecule of the current invention in vitro, (b)
obtaining a whole cell extract, (c) passing the whole cell extract over a first solid support
comprising streptavidin protein, (d) eluting a first eluate with the addition of biotin, (e) collecting
the first eluate, (f) passing the first eluate over a second solid support comprising MS2 coat
protein, (g) eluting a second eluate with the addition of a reagent selected from the group
consisting of glutathione, RNAse or a denaturant, and (h) collecting the second eluate, wherein
the second eluate contains the isolated RNA-protein complex and identifying the protein in the
RNA-protein complex. In one embodiment, steps (c) to (e) are repeated.

The invention also relates to a method of screening for a compound that modulates the formation
of an RNA-protein complex formed in vivo comprising, expressing in a eukaryotic cell an RNA
fusion molecule of the current invention in the presence of a test compound, generating a whole
cell extract, passing the extract over a first solid support comprising streptavidin protein, eluting
a first eluate with the addition of biotin, collecting the first eluate, passing the first eluate over a
second solid support comprising MS2 coat protein, eluting a second eluate with the addition of a
reagent selected from the group consisting of glutathione, RNAse or a denaturant, collecting the
second eluate, wherein the second eluate contains the isolated RNA-protein complex, measuring
the amount of isolated RNA-protein complex present, and comparing it with the amount of isolated
RNA-protein complex present in the absence of the compound to be tested.

The invention also provides for a method of screening for a compound that modulates the
formation of an RNA-protein complex formed in vitro comprising, (a) expressing an RNA fusion
molecule of the current invention in vitro, (b) obtaining a whole cell extract, (c) passing the
whole cell extract over a first solid support comprising streptavidin protein, (d) eluting a first
eluate with the addition of biotin, (e) collecting the first eluate, (f) passing the first eluate over a
second solid support comprising MS2 coat protein, (g) eluting a second eluate with the addition
of a reagent selected from the group consisting of glutathione, RNAse or a denaturant, (h)
collecting the second eluate, wherein the second eluate contains the isolated RNA-protein
complex, (i) measuring the amount of isolated RNA-protein complex present; and (j) comparing
it with the amount of isolated RNA-protein complex present in the absence of the compound to be
tested. In one embodiment, steps (c) to (e) are repeated.

The invention also provides for a kit for detecting RNA-protein complexes comprising an
isolated DNA construct of the current invention.

The invention also provides for a kit for detecting RNA-protein complexes comprising a vector
of the current invention. 

Brief Description of the Drawings 

Preferred embodiments of the invention will be described in relation to the drawings in which: 
Figure 1. Tandem RNA affinity purification. A) RNAs of interest are tagged at their 5' or 3'
end with two different RNA tags. The tagged RNAs are then expressed either in vitro or in vivo
and tested for function. Functional complexes containing the tagged RNA are purified from
extracts using two affinity resins, each of which is capable of binding one of the tags. An
important aspect of the tags, particularly the first tag used, is that it must be capable of being
dissociated from its affinity resin using conditions that do not disrupt the RNA-protein complex.
Proteins eluted from the second resin are generally sufficiently pure for identification by SDS
PAGE, silver staining, and mass spectrometry. Bound RNAs can also be identified using
RT-PCR or microarray analysis.

B) Sequence of the TRAP cassette. Sequences in parentheses indicate each of the different
functional motifs within the TRAP cassette.

Figure 2. TRAP-tag purification using in vitro transcribed RNA. 

A) In vitro purification of proteins from extracts. Embryonic whole cell extracts were mixed with
TRAP-tagged constructs or control constructs, and then passed over the two affinity columns.
Eluates were subjected to SDS PAGE and silver staining. Lane 1: no RNA added to the extract.
Lane 2: no bait RNA fused to the TRAP RNA. Lane 3: purification using TRAP RNA fused to a
localization element from the 3'UTR of the Drosophila wingless gene mRNA (WLE1). Lane 4:
protein purification using TRAP RNA fused to a second transcript localizing element in the
wingless mRNA 3'UTR (WLE2). Note that the RNAs containing the two baits (WLE1 and
WLE2) bind proteins that are not bound by the resins or TRAP RNA alone. Interestingly, the
proteins bound specifically by WLE1 and WLE2 also differ from each other.

B) In vitro purification of Bic-D from embryo extracts. Following the purification as described
above, eluted proteins were subjected to SDS PAGE and then transferred to membranes for
Western blotting with anti-Bic-D antiserum. Lanes 1-4 are as described above. Note that the
Bic-D signal is highly enriched in lanes 3 and 4 after TRAP purification with the WLE1 and WLE2
localization elements. Bic-D was only detectable in the crude extract after much longer
exposures.

Figure 3. Localization of TRAP-tagged WLE RNAs in Drosophila embryos. To ensure that the
TRAP-tag does not interfere with bait RNA function, WLE localization elements fused to TRAP
RNAs were tested for localization activity in embryos. Panel A) shows the apical localization of
fluorescently labeled WLE2 RNA after injection into a syncytial blastoderm stage embryo. Note
the red RNA localized above the green labeled nuclei. Panel B) shows the random localization of
a mutagenized WLE2 element that has no localizing activity. Panel C) shows apical localization
of TRAP-tagged WLE2, showing localization that is indistinguishable from the untagged
mRNA.

Figure 4. A) TRAP-tag purification using RNA expressed in vivo. TRAP-tagged WLE1 and
WLE2 localization elements were expressed in transgenic Drosophila embryos, and whole cell
extracts were passed over tandem affinity columns. Lane 1: no TRAP-tagged RNA expressed.
Lane 2: purification using TRAP-tagged WLE1. Lane 3: purification using TRAP-tagged
WLE2. Once again, proteins bound specifically by WLE1 and WLE2 differ.

Table 1. Suitability of tags for TRAP-tag purification. Tags used for affinity purification are
shown in the left hand column. Sizes, affinity matrices, eluting reagents, and performance are
shown in the columns to the right. Binding and elution efficiencies were determined using
32P-labeled RNAs expressed in vitro and are expressed as percentage of label loaded.

Detailed Description 

The present invention will now be described more fully with reference to the accompanying
drawings, in which preferred embodiments of the invention are shown. This invention may,
however, be embodied in different forms and should not be construed as limited to the
embodiments set forth herein. Rather, these embodiments are provided so that this disclosure
will be thorough and complete, and will fully convey the scope of the invention to those skilled
in the art.

The term "bait sequence" as used herein, is a cDNA or DNA sequence that encodes a target
RNA sequence. Examples of suitable bait sequences include those encoding RNAs such as the HIV
Rev-binding tat element, the E. coli N protein binding nut element, and various recognition elements
within RNA splice sites.

The term "isolated DNA sequence" as described herein includes DNA whether single or double
stranded. The sequence is isolated and/or purified (i.e. from its natural environment), in
substantially pure or homogeneous form, free or substantially free of nucleic acid or genes of the 
species of interest or origin other than the promoter or promoter fragment sequence. The DNA 
sequence according to the present invention may be wholly or partially synthetic. The term 
"isolated" encompasses all these possibilities. 

The term "operably linked" as described herein means joined as part of the same nucleic acid
molecule, suitably positioned and oriented for transcription to be initiated from the promoter.

The term "promoter" as described herein refers to a sequence of nucleotides from which
transcription may be initiated of DNA operably linked downstream (i.e. in the 3' direction on the
sense strand of double-stranded DNA). The promoter or promoter fragment may comprise one or
more sequence motifs or elements conferring developmental and/or tissue-specific regulatory
control of expression. For example, the promoter or promoter fragment may comprise a neural
or gut-specific regulatory control element.

The term "DNA tag" as used herein refers to short DNA or cDNA sequences that encode a
binding partner for a ligand. The ligand may be any molecule that specifically binds to the
binding partner, such as antibiotics, antibodies or specific proteins. The DNA tags of the current
invention may be located 3' or 5' to the bait sequence. DNA tags encode RNA tags.

The term "RNA tags" as used herein refers to short RNA sequences that function as a binding
partner for a ligand. The RNA tags must be short, fully modular and must not interfere with each
other or with the target RNA sequence. At least one of the RNA tags must interact with its
binding partner in a reversible fashion.

The term "transcription cassette" as used herein refers to a nucleic acid sequence encoding a
nucleic acid that is transcribed. To facilitate transcription, nucleic acid elements such as
promoters, enhancers, transcriptional termination sequences and polyadenylation sequences are
typically included in the transcription cassette.

The term "S1" as used herein refers to the streptavidin binding sequence as DNA [SEQ ID NO:1
or SEQ ID NO:2] or RNA [SEQ ID NO:3].

The term "MS2" as used herein refers to MS2 coat protein binding sequence as DNA [SEQ ID
NO:4] or RNA [SEQ ID NO:5].

The term "2xMS2" as used herein refers to two MS2 coat protein binding sequences as DNA
[SEQ ID NO:6 and SEQ ID NO:7] or RNA [SEQ ID NO:8].
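
The terms defined above can be gathered into a small data model. The following Python sketch is illustrative only: the class name, field names and placeholder strings are hypothetical and do not appear in this disclosure, and the two checks simply restate the tag counts recited in the embodiments (at least one S1 and one MS2 tag, or at least one S1 and two MS2 tags).

```python
from dataclasses import dataclass, field
from typing import List

# Tag labels follow the definitions in the text; everything else here
# (class name, field names, placeholder values) is hypothetical.
S1 = "S1"    # streptavidin binding tag
MS2 = "MS2"  # MS2 coat protein binding tag


@dataclass
class TranscriptionCassette:
    promoter: str
    bait: str
    terminator: str
    tags: List[str] = field(default_factory=list)
    insulators: int = 0

    def is_dual_tagged(self) -> bool:
        """At least two tags, including at least one S1 and one MS2."""
        return (
            len(self.tags) >= 2
            and self.tags.count(S1) >= 1
            and self.tags.count(MS2) >= 1
        )

    def is_triple_tagged(self) -> bool:
        """At least three tags: one or more S1 and two or more MS2."""
        return (
            len(self.tags) >= 3
            and self.tags.count(S1) >= 1
            and self.tags.count(MS2) >= 2
        )


# Example: a cassette carrying one S1 tag and a 2xMS2 tag pair.
cassette = TranscriptionCassette(
    promoter="hypothetical-promoter",
    bait="WLE2",
    terminator="hypothetical-terminator",
    tags=[S1, MS2, MS2],
    insulators=4,
)
print(cassette.is_dual_tagged(), cassette.is_triple_tagged())
```

A model of this kind only records which elements are present; it says nothing about whether the tags are modular or non-interfering, which must be established experimentally.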

The terminology used in the description of the invention herein is for the purpose of describing
particular embodiments only, and is not intended to be limiting to the invention. As used in the
description of the invention and the appended claims, the singular forms "a", "an" and "the" are
intended to include the plural forms as well, unless the context clearly indicates otherwise.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning
as commonly understood by one of ordinary skill in the art to which this invention belongs. All
publications, patent applications, patents and other references mentioned herein are incorporated
by reference in their entirety.

The present invention relates to a method for isolating specific RNA-protein complexes formed
in vivo. However, it can also be used to isolate or verify complexes formed in vitro.
In vivo complex formation and purification is accomplished by expressing tagged versions of the
RNA of interest in vivo and then using the tag to isolate associated functional RNP complexes.
Tags in the form of short RNA sequences that interact with specific proteins, antibiotics or
synthetic ligands can be readily inserted 5' or 3' to the RNA of interest. Although a number of
these potential RNA tags exist, purification with these tags gives at most a thousand-fold
purification of the associated RNAs. By using two RNA tags, the TRAP-tag method of the
current invention provides approximately a million-fold purification of associated RNAs, which
is sufficient for the identification of most cellular proteins. The tags in the current invention must
be relatively short, fully modular, and must not interfere with each other or with the RNA of
interest. In addition, at least one of the tags must interact with its ligand in a reversible fashion so
that RNP complexes can be eluted intact from the first ligand matrix and bound to the second
matrix (see Fig. 1A). When expressed in vivo, TRAP-tagged RNAs assemble into functional
complexes, and these complexes are readily purified to homogeneity.
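
The step from a thousand-fold to approximately a million-fold purification can be illustrated with simple arithmetic. The sketch below assumes that the two affinity steps act independently, so that their enrichment factors multiply; this assumption and the function name are ours and are not stated in the text.

```python
import math


def combined_fold(step_folds):
    """Multiply per-step enrichment factors (assumes independent steps)."""
    return math.prod(step_folds)


# Upper bound quoted in the text for purification with a single RNA tag.
single_tag_fold = 1_000

# Two sequential affinity steps, one per tag.
two_tag_fold = combined_fold([single_tag_fold, single_tag_fold])
print(f"{two_tag_fold:,}-fold")
```

In practice each step will achieve somewhat less than its upper bound, so the product should be read as an order-of-magnitude estimate.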

Nucleic Acid Molecules 

Functionally equivalent nucleic acid molecule or polypeptide sequence
The term "isolated DNA sequence" refers to a DNA sequence the structure of which is not
identical to that of any naturally occurring DNA sequence or to that of any fragment of a
naturally occurring DNA sequence spanning more than three separate genes. The term therefore
covers, for example, (a) DNA which has the sequence of part of a naturally occurring genomic
DNA molecule; (b) a DNA sequence incorporated into a vector or into the genomic DNA of a
prokaryote or eukaryote, respectively, in a manner such that the resulting molecule is not
identical to any naturally occurring vector or genomic DNA; (c) a separate molecule such as a
cDNA, a genomic fragment, a fragment produced by reverse transcription of polyA RNA which
can be amplified by PCR, or a restriction fragment; and (d) a recombinant DNA sequence that is
part of a hybrid gene, i.e., a gene encoding a fusion protein. Specifically excluded from this
definition are nucleic acids present in mixtures of (i) DNA molecules, (ii) transfected cells,
and (iii) cell clones, e.g., as these occur in a DNA library such as a cDNA or genomic DNA
library.

Modifications in the DNA sequence, which result in production of a chemically equivalent or
chemically similar amino acid sequence, are included within the scope of the invention.
Modifications include substitution, insertion or deletion of nucleotides or altering the relative
positions or order of nucleotides.

Sequence identity 

The invention includes modified nucleic acid molecules with a sequence identity at least about: 
>95% to the DNA sequences provided in SEQ ID NO: 1 to SEQ ID NO: 4 (or a partial sequence 
thereof or their complementary sequence). Preferably about 1, 2, 3, 4, 5, 6 to 10, 10 to 25, 26 to
50 or 51 to 100, or 101 to 250 nucleotides are modified. Sequence identity is most preferably
assessed by the algorithm of the BLAST version 2.1 program advanced search (parameters as
above). BLAST is a series of programs that are available online at
http://www.ncbi.nlm.nih.gov/BLAST.
References to BLAST searches are:

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment
search tool." J. Mol. Biol. 215:403-410.

Gish, W. & States, D.J. (1993) "Identification of protein coding regions by database similarity
search." Nature Genet. 3:266-272.

Madden, T.L., Tatusov, R.L. & Zhang, J. (1996) "Applications of network BLAST server" Meth.
Enzymol. 266:131-141.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J.
(1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs." Nucleic Acids Res. 25:3389-3402.

Zhang, J. & Madden, T.L. (1997) "PowerBLAST: A new network BLAST application for
interactive or automated sequence analysis and annotation." Genome Res. 7:649-656.

Other programs are also available to calculate sequence identity, such as the Clustal W program
(preferably using default parameters; Thompson, J.D. et al., Nucleic Acids Res. 22:4673-4680).
DNA sequences functionally equivalent to the S1 SEQ ID NO: 1, or MS2 SEQ ID NO: 3 can
occur in a variety of forms as described above.
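
As a minimal illustration of what a percent identity figure means, the following Python sketch counts matching positions between two sequences that have already been aligned to equal length. It is not the BLAST or Clustal W algorithm, which also perform the alignment and apply their own scoring; the function name and the example sequences are hypothetical and are not the sequences of SEQ ID NO: 1 to SEQ ID NO: 4.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared columns that match in two pre-aligned sequences.

    Gaps are written as '-'. Columns where both sequences have a gap are
    skipped; a gap opposite a nucleotide counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 40-nucleotide reference and a variant with one substitution.
reference = "ACGT" * 10
variant = reference[:10] + "T" + reference[11:]

identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identical; exceeds 95%: {identity > 95.0}")
```

One substitution in 40 positions leaves 39 matches, so this variant would fall within the greater than 95% identity range described above.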

The sequences of the invention can be prepared according to numerous techniques. The
invention is not limited to any particular preparation means. For example, the nucleic acid
molecules of the invention can be produced by cDNA cloning, genomic cloning, cDNA
synthesis, polymerase chain reaction (PCR) or a combination of these approaches (Current
Protocols in Molecular Biology, F.M. Ausubel et al., 1989). Sequences may be synthesized using
well-known methods and equipment, such as automated synthesizers.

Hybridization

Other functionally equivalent forms of the S1 SEQ ID NO: 1 and SEQ ID NO: 2 and MS2 DNA
SEQ ID NO: 3 and SEQ ID NO: 4 molecules can be isolated using conventional DNA-DNA or
DNA-RNA hybridization techniques. These nucleic acid molecules and the S1 SEQ ID NO: 1
and SEQ ID NO: 2 and MS2 sequences can be modified without significantly affecting their
activity.

The present invention also includes nucleic acid molecules that hybridize to one or more of the
DNA sequences provided in SEQ ID NO: 1 to SEQ ID NO: 4 (or a partial sequence thereof or
their complementary sequence). Such nucleic acid molecules preferably hybridize to all or a
portion of S1 SEQ ID NO: 2 or MS2 SEQ ID NO: 4 or their complement under low, moderate
(intermediate), or high stringency conditions as defined herein (see Sambrook et al. (most recent
edition) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y.; Ausubel et al. (eds.), 1995, Current Protocols in Molecular Biology, (John
Wiley & Sons, NY)). The hybridizing portion of the hybridizing nucleic acids is typically at least 15 (e.g.
20, 25, 30 or 50) nucleotides in length. The hybridizing portion of the hybridizing nucleic acid is
at least 80%, e.g. at least 95% or at least 98%, identical to the sequence of a portion or all of a
nucleic acid encoding S1 or MS2 or their complement. Hybridizing nucleic acids of the type
described herein can be used, for example, as a cloning probe, a primer (e.g. a PCR primer) or a
diagnostic probe. Hybridization of the oligonucleotide probe to a nucleic acid sample typically
is performed under stringent conditions. Nucleic acid duplex or hybrid stability is expressed as
the melting temperature or Tm, which is the temperature at which a probe dissociates from a
target DNA. This melting temperature is used to define the required stringency conditions. If
sequences are to be identified that are related and substantially identical to the probe, rather than
identical, then it is useful to first establish the lowest temperature at which only homologous
hybridization occurs with a particular concentration of salt (e.g. SSC or SSPE). Then, assuming
that 1% mismatching results in a 1 degree Celsius decrease in the Tm, the temperature of the
final wash in the hybridization reaction is reduced accordingly (for example, if sequences having
greater than 95% identity with the probe are sought, the final wash temperature is decreased by 5
degrees Celsius). In practice, the change in Tm can be between 0.5 degrees Celsius and 1.5
degrees Celsius per 1% mismatch. Low stringency conditions involve hybridizing at about:
IXSSC, 0.1% SDS at 50*'C. High stringency conditions are: O.IXSSC, 0.1% SDS at SS^'C. 
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Moderate stringency is about IX SSC 0. 1 % SDS at 60 degrees Celsius. The parametm of salt 
concentration and temperature can be varied to achieve the optimal level of identity between the 
probe and the target nucleic acid. 
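
The wash-temperature adjustment described above can be worked through numerically. The following Python sketch is only an illustration: the function name `final_wash_temperature`, its parameter names and the 65 degree Celsius starting temperature in the example are assumptions introduced here. The only values taken from the text are the approximation of 1 degree Celsius per 1% mismatch and the practical range of 0.5 to 1.5 degrees Celsius per 1% mismatch.

```python
def final_wash_temperature(homologous_wash_temp_c, min_identity_percent,
                           tm_drop_per_percent_mismatch=1.0):
    """Estimate the final wash temperature (degrees Celsius).

    homologous_wash_temp_c: lowest temperature at which only homologous
        hybridization occurs at the chosen salt concentration (assumed to
        have been determined empirically).
    min_identity_percent: lowest percent identity to the probe that should
        still be retained.
    tm_drop_per_percent_mismatch: assumed Tm decrease per 1% mismatch.
    """
    if not 0.0 <= min_identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100 percent")
    if tm_drop_per_percent_mismatch < 0.0:
        raise ValueError("Tm drop per percent mismatch must not be negative")
    mismatch_percent = 100.0 - min_identity_percent
    return homologous_wash_temp_c - mismatch_percent * tm_drop_per_percent_mismatch


if __name__ == "__main__":
    # Hypothetical starting point: homologous-only hybridization at 65 degrees Celsius.
    for drop in (0.5, 1.0, 1.5):
        temp = final_wash_temperature(65.0, 95.0, drop)
        print(f"{drop} C per 1% mismatch -> final wash at {temp} C")
```

With the 1 degree per 1% approximation, sequences of at least 95% identity call for a wash 5 degrees Celsius below the homologous-only temperature; the 0.5 to 1.5 degree range brackets that reduction between 2.5 and 7.5 degrees Celsius.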

The present invention also includes nucleic acid molecules from any source, whether modified or 
not, that hybridize to genomic DNA, cDNA, or synthetic DNA molecules that encode. A nucleic 
acid molecule described above is considered to be functionally equivalent to an S1 nucleic acid 
molecule SEQ ID NO: 1 of the present invention if the sequence encoded by the nucleic acid 
molecule is recognized in a specific manner by streptavidin and is elutable by biotin. A nucleic 
acid molecule described above is considered to be functionally equivalent to an MS2 SEQ ID 
NO: 4 nucleic acid molecule of the present invention if the sequence encoded by the nucleic acid 
molecule is recognized in a specific manner by coat binding protein and is elutable by 
Glutathione-S-Transferase (GST)-coat binding protein fusion protein. 

Vectors 

The present invention provides an expression vector comprising a transcription cassette. The 
transcription cassette can be cloned into a variety of vectors by means that are well known in the 
art. Such a vector may comprise a suitably positioned restriction site or other means for insertion 
of a transcription cassette. The vector may also contain a selectable marker. For use in an assay 
or experiment, commercially available vectors such as CMV Casper promoter vector may be 
employed. For use in gene therapy, vectors such as adenovirus may be employed. Cell cultures 
transformed with the DNA sequences of the current invention are useful as research tools, 
particularly for studies of RNA-protein complexes. One skilled in the art will appreciate that 
there are a wide variety of suitable vectors. 

Host Cells 

A further aspect of the present invention provides a host cell containing a transcription cassette 
of the current invention. Examples of particularly desirable host cells include ES, P19, COS, S2 
and SF9 cells. Methods known in the art for transformation include, but are not limited to, 
electroporation, rubidium chloride, calcium chloride, calcium phosphate or chloroquine 
transfection, viral infection, phage transduction, microinjection, and the use of cationic lipid and 
lipid/amino acid complexes or of liposomes. These, or a large variety of other commercially 
available and readily synthesized transfection adjuvants, are useful to transfer the vectors of the 
current invention into host cells. Host cells are cultured in conventional nutrient media. The 
media may be modified as appropriate for inducing promoters, amplifying nucleic acid sequences 
of interest or selecting transformants. The culture conditions, such as temperature, composition 
and pH, will be apparent. After transformation, transformants may be identified on the basis of a 
selectable phenotype. 

RNA fusion molecules 

The current invention provides for RNA fusion molecules comprising RNA tags, insulator 
elements and target RNA sequences. A target RNA sequence may be an oligoribonucleotide 
sequence or a ribonucleic acid sequence. Generally, for use in this invention, the target RNA 
sequence is RNA, including ribosomal RNA, RNA encoded by a gene, messenger RNA, UTRs, 
ribozyme RNA, catalytic RNA, small nuclear RNA, small nucleolar RNA, etc., from a 
microorganism, or an RNA expressed by a cell infected with a virus, or RNA from a host cell, or 
RNA encoded by a genomic sequence, or RNA encoded by a chemically synthesized DNA 
sequence, or random RNA encoded by randomly isolated DNA. Insulator elements may be 
placed on either side of the RNA tags and function to ensure proper folding of the RNA tags and 
to discourage interactions between the tags and the target RNA sequence. Examples of suitable 
insulator elements include, but are not limited to, stretches of 4-5 identical nucleotides (e.g. 8-10 
adenosines) coupled with paired restriction sites that do not interact with the tag or bait 
sequences. The 5' and 3' restriction sites need to be identical, as these sequences will hybridize, 
forming a stem that forces the "insulator" polynucleoside sequences to be "unpaired", thus 
isolating the folded tag stem loop structure from the remainder of the RNA sequences produced 
from a specific vector. Insulator elements may also be called spacers. 
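
The insulator layout described above can be checked against the sequence listing. The following Python sketch is an illustration under stated assumptions: the helper names (`reverse_complement`, `is_palindromic`, `insulated_tag`) are hypothetical, and the decomposition of SEQ ID NO: 1 into a flanking hexamer (ATCGAT, which is the ClaI recognition sequence), adenosine runs of 5 and 6, and the SEQ ID NO: 2 tag is our reading of the listed sequences rather than a statement from the description.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna):
    """Reverse complement of a lowercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """True if a site equals its own reverse complement, so two copies can pair."""
    return site == reverse_complement(site)


def insulated_tag(tag, site, left_spacer_len, right_spacer_len):
    """Assemble site + poly(A) spacer + tag + poly(A) spacer + site."""
    return site + "a" * left_spacer_len + tag + "a" * right_spacer_len + site


# Sequences copied from the sequence listing (spaces removed).
S1_TAG = "".join("gaccgaccag aatcatgcaa gtgcgtaaga tagtcgcggg ccggg".split())  # SEQ ID NO: 2
S1_CASSETTE = "".join(
    "atcgataaaa agaccgacca gaatcatgca agtgcgtaag atagtcgcgg gccgggaaaa aaatcgat".split()
)  # SEQ ID NO: 1

if __name__ == "__main__":
    site = "atcgat"  # flanking hexamer read from SEQ ID NO: 1
    rebuilt = insulated_tag(S1_TAG, site, 5, 6)
    print("flanking site is palindromic:", is_palindromic(site))
    print("rebuilt cassette matches SEQ ID NO: 1:", rebuilt == S1_CASSETTE)
    print("cassette length:", len(rebuilt))
```

Because the flanking site is its own reverse complement, the identical 5' and 3' copies can base-pair with each other in the transcript, which is the stem-forming behaviour the paragraph above relies on.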

Method of Isolating 

The present invention relates to a method for isolating an RNA-protein complex formed in vivo 
comprising: 

(a) expressing in a eukaryotic cell an RNA fusion molecule of the current invention; 

(b) generating a whole cell extract; 

(c) passing the extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein; 

(g) eluting a second eluate with the addition of a reagent selected from the group consisting 
of glutathione, RNAse or a denaturant; and 

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein 
complex. 

The present invention also relates to a method for isolating an RNA-protein complex formed in 
vitro comprising: 

(a) expressing the RNA fusion molecule of claim 11, 12, 13, 14, 15, or 16 in vitro; 

(b) obtaining a whole cell extract; 

(c) passing the whole cell extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein; 

(g) eluting a second eluate with the addition of a reagent selected from the group consisting 
of glutathione, RNAse or a denaturant; and 

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein 
complex. 

The isolated protein part of the RNA-protein complex may then be identified by various methods 
and techniques including but not limited to SDS-PAGE, silver staining, Western blotting and mass 
spectrometry. 

Examples of suitable solid supports for use with the different embodiments of the current 
invention include affinity columns comprising bound streptavidin or bound MS2, wherein the 
MS2 can be bound to agarose or sepharose beads. MS2 affinity columns can also be made by 
crosslinking to resins such as affigel beads, or binding as a fusion protein to an appropriate resin 
(e.g. GST-MS2 to glutathione beads). 

Method of Screening 

The current invention relates to a method of screening for a compound that modulates the 
formation of an RNA-protein complex formed in vivo comprising: 

(a) expressing in a eukaryotic cell an RNA fusion molecule of the instant invention, in the 
presence of a test compound; 

(b) generating a whole cell extract; 

(c) passing the extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein; 

(g) eluting a second eluate with the addition of a reagent selected from the group consisting of 
glutathione, RNAse or a denaturant; 

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein 
complex; 

(i) measuring the amount of isolated RNA-protein complex present; and 

(j) comparing the amount of isolated RNA-protein complex present with the amount present in 
the absence of the compound to be tested. 

Another embodiment of the current invention relates to a method of screening for a compound 
that modulates the formation of an RNA-protein complex formed in vitro comprising: 

(a) expressing the RNA fusion molecule of claim 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 in 
vitro; 

(b) obtaining a whole cell extract; 

(c) passing the whole cell extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein; 

(g) eluting a second eluate with the addition of a reagent selected from the group consisting 
of glutathione, RNAse or a denaturant; 

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein 
complex; 

(i) measuring the amount of isolated RNA-protein complex present; and 

(j) comparing the amount of isolated RNA-protein complex present with the amount present in 
the absence of the compound to be tested. 

Other assays (as well as variations of the above assays) will be apparent from the description of 
this invention. For example, the amount of test compound may be either fixed or increased, and 
a plurality of compounds or proteins may be tested at a single time. "Modulation" can refer to 
enhanced formation of the RNA-protein complex, a decrease in formation of the RNA-protein 
complex, a change in the type or kind of the RNA-protein complex or a complete inhibition of 
formation of the RNA-protein complex. Suitable compounds that may be used include but are 
not limited to proteins, nucleic acids, small molecules, hormones, antibodies, peptides, antigens, 
cytokines, growth factors, pharmacological agents including chemotherapeutics, carcinogens, or 
other cells (i.e. cell-cell contacts). Screening assays can also be used to map binding sites on 
RNA or protein. For example, tag sequences encoding RNA tags can be mutated (deletions, 
substitutions, additions) and then used in screening assays to determine the consequences of the 
mutations. 
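
The comparison in step (j) and the meaning of "modulation" given above can be sketched as a simple decision rule. The following Python example is a hypothetical illustration only: the function name `classify_modulation`, the arbitrary measurement units and the 20% tolerance used to decide that two amounts differ are all assumptions, not part of the described method. A change in the type or kind of complex is qualitative and is not captured by this sketch.

```python
def classify_modulation(amount_with_compound, amount_without_compound, tolerance=0.2):
    """Classify the effect of a test compound on RNA-protein complex formation.

    Amounts are in any consistent unit (for example band intensity).
    tolerance is an assumed fractional change below which the two
    measurements are treated as indistinguishable.
    """
    if amount_without_compound <= 0:
        raise ValueError("the no-compound control must contain measurable complex")
    if amount_with_compound < 0:
        raise ValueError("amounts must not be negative")
    if amount_with_compound == 0:
        return "complete inhibition"
    ratio = amount_with_compound / amount_without_compound
    if ratio > 1.0 + tolerance:
        return "enhanced formation"
    if ratio < 1.0 - tolerance:
        return "decreased formation"
    return "no detectable modulation"


if __name__ == "__main__":
    # Hypothetical measurements: (with compound, without compound).
    for with_c, without_c in [(150.0, 100.0), (40.0, 100.0), (0.0, 100.0), (105.0, 100.0)]:
        print(with_c, without_c, "->", classify_modulation(with_c, without_c))
```

In practice the tolerance would be set from the variability of replicate purifications rather than fixed in advance.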

Kits 

The invention includes kits for detecting RNA-protein complexes comprising at least one 
isolated DNA construct of the invention or at least one vector of the current invention. 

Tandem RNA purification 

A number of RNA motifs suitable as RNA affinity tags exist. We first tested five of these for 
potential use in our double-tagging system. These include the "streptotag", a streptomycin 
binding aptamer (Bachler et al., 1999), "S1", a streptavidin binding aptamer (Srisawat and 
Engelke, 2001), "D8", a sephadex binding aptamer (Srisawat et al., 2001), the MS2 phage coat 
protein binding RNA (Jurica et al., 2002) and the lambda phage nut RNA. Table 1 shows the 
relative binding and elution efficiencies of each 32P-labeled tag and its ligand. Two of the five 
tags, the streptavidin (S1, SEQ ID NO: 1 and SEQ ID NO: 2) and MS2 coat protein (MS2) tags, 
were found to bind and elute efficiently under the desired purification conditions. Importantly, 
neither tag cross-reacted with any of the other tested ligands. Greater than 95% of the S1 tag 
(SEQ ID NO: 1 and SEQ ID NO: 2) bound to streptavidin agarose beads, and could be recovered 
quantitatively with the addition of biotin. Approximately 80% of the loaded MS2 tag bound to 
GST-coat protein beads, and approximately 70% of the loaded tag could be eluted with 
glutathione. 

Table 1 - RNA aptamer tags tested for use in TRAP vectors 

RNA aptamer   SEQ ID NO          Length          Affinity target          Eluted with           % Bound    % Eluted
                                 (nucleotides)
Streptotag    9-DNA, 10-RNA      64              8-hydroxy-streptomycin   Streptomycin          21% ±2%    12% ±6%
MS2, 2xMS2    4,6-DNA, 5,8-RNA   38, 96          Coat Binding Protein     Reduced Glutathione   73% ±3%    68% ±8%
S1            1-DNA, 3-RNA       68              Streptavidin             Biotin                >99%       94% ±5%
D8            15-DNA, 16-RNA     64              Sephadex                 n/a                   34% ±1%    21% ±10%
Nut 1-39      12-DNA, 14-RNA     33              N-protein 1-22           n/a                   <1%        <1%


Next, the ability of the streptavidin and MS2 coat protein tags to function together and in the 
presence of an RNA target molecule was tested. Cassettes containing a T7 promoter, the two 
RNA tags, alternative target RNA insertion sites and a poly A tail were made (Figure 1B). 
Insulator elements, consisting of 8-10 adenosines flanked by identical restriction sites, were 
placed on either side of each tag to ensure proper folding of the tags and to discourage 
interactions between the tags and the inserted target RNA. 32P-labeled RNAs were first tested for 
retention and elution on streptavidin and GST-coat protein columns. Both tags worked with 
much the same efficiency as when used individually. A construct containing 1 S1 tag (SEQ ID 
NO: 1 and SEQ ID NO: 2) and 2 MS2 tags appeared to work best. 
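
If each tag performs in tandem as it does individually, a rough expectation for overall recovery is the product of the per-column recoveries. The following Python sketch illustrates that arithmetic under stated assumptions: the function name `tandem_recovery` is hypothetical, recoveries are treated as independent fractions of the loaded RNA, and the example values (about 0.95 for the streptavidin step and about 0.70 for the MS2 step) are the approximate figures reported above for the individual tags, not measured tandem yields.

```python
import math


def tandem_recovery(step_recoveries):
    """Overall fraction recovered after passing through each step in turn.

    step_recoveries: iterable of fractions (0 to 1), one per column, each
    giving the share of loaded RNA that is bound and then eluted.
    """
    fractions = list(step_recoveries)
    for fraction in fractions:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("each recovery must be a fraction between 0 and 1")
    return math.prod(fractions)


if __name__ == "__main__":
    streptavidin_step = 0.95  # assumed: >95% bound, recovered quantitatively
    ms2_step = 0.70           # assumed: ~70% of loaded tag eluted with glutathione
    one_round = tandem_recovery([streptavidin_step, ms2_step])
    # Optional second streptavidin round before the MS2 column.
    two_rounds = tandem_recovery([streptavidin_step, streptavidin_step, ms2_step])
    print(f"one streptavidin round:  {one_round:.1%}")
    print(f"two streptavidin rounds: {two_rounds:.1%}")
```

On these assumptions roughly two thirds of the loaded tagged RNA would be expected in the final eluate, and repeating the streptavidin step costs only a few percent more.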

TRAP tag purification using in vitro transcribed RNA 

Next, the constructs were tested for their ability to purify specific RNA binding proteins from a 
complex protein mixture. Two approximately 100-nucleotide long elements from the 
Drosophila wingless gene mRNA (WLE1 and WLE2) were chosen for this purpose. These 
elements are required for the asymmetrical localization of wingless transcripts to apical 
cytoplasm (Simmonds et al., 2001). The two elements show no similarity in sequence or 
predicted secondary structure and exhibit marked differences in their ability to localize 
transcripts. On the other hand, both appear to mediate localization via dynein-dependent 
microtubule transport (Wilkie and Davis, 2001). Hence, they probably interact with unique but 
overlapping subsets of proteins. 

The tagged RNAs were expressed in vitro, and the cold RNA was mixed for 30 minutes with 
Drosophila embryo extracts prior to purification over the two columns. Figure 2A shows that 
each of the tagged localization elements did indeed associate with different subsets of proteins 
that were not bound by beads or tags alone. Nine of 19 proteins identified by mass spectrometry 
are known or predicted RNA binding proteins (Simmonds and Krause, in preparation). Figure 2B 
shows that one of these proteins is Bic-D. Bic-D has been previously implicated as a protein 
required for apical mRNA transport in blastoderm stage Drosophila embryos (Bullock and 
Ish-Horowicz, 2001). 

Localization of TRAP-tagged WLE RNAs in Drosophila embryos 

The final test was to ensure that complexes formed on the tagged RNAs in vivo are both active 
and readily purified. To confirm this, tagged WLE constructs were first fluorescently labeled and 
injected into syncytial blastoderm stage embryos. RNAs with an apical localization motif will 
move from the site of injection upwards, between the syncytial nuclei to the apical surface 
(Bullock and Ish-Horowicz, 2001). Figure 3A shows untagged WLE2 RNA after localization to 
the apical surface. Figure 3C shows that TRAP-tagged WLE2 RNA localizes to the apical 
surface in an indistinguishable fashion. Thus, the tags appear to have no effect on the function of 
this localizing element. TRAP-tagged wingless localization elements expressed in transgenic 
embryos also localized apically (data not shown). Extracts were made from these transgenic fly 
lines and used for purification of WLE-associated proteins. 

TRAP tag purification using RNA expressed in vivo 

Figure 4 shows that, as in vitro, each of the tagged WLE constructs binds a different subset of 
proteins. The identities of some of these proteins were determined by mass spectrometry. Once 
again, one of the purified proteins was Bic-D. 

Note that, although the proteins identified here were easily detected using a small amount of 
extract and silver staining, the reversibility of the two columns permits the optional use of a 
second round of purification to detect proteins of very low abundance and proteins that do not 
bind the bait stoichiometrically or in all cell types. The S1 tag (SEQ ID NO: 1 and SEQ ID NO: 2) 
is particularly well suited for a second round of purification. It provides high degrees of 
purification with little loss of material, and the biotin used for elution is easily removed. Biotin 
removal is achieved by running the eluate over an avidin column (the S1 tag, SEQ ID NO: 1 and 
SEQ ID NO: 2, does not bind avidin). The flow-through is then bound to the second streptavidin 
column and eluted with biotin as before. This approach can also be used for prior removal of 
streptavidin binding proteins, should they be present in extracts in large amounts. 

Clearly, this approach is applicable to any cell or tissue type. The TRAP cassette is simply 
placed into an appropriate vector. Although the in vivo application of the method is the most 
powerful version of this approach, in vitro assays are also clearly applicable. For example, using 
mutagenesis, the importance of specific nucleotides and structural aspects of known or newly 
discovered interactions can be rapidly tested with in vitro expressed RNAs and then confirmed in 
vivo. This approach is also amenable to high throughput analyses. This is particularly true for in 
vitro work with extracts, and with transfected or virally infected cells. With a little more effort, 
the approach can also be applied to transformed cells and transgenic tissues. For example, as has 
been done for proteins in yeast, TRAP tags could be placed within each yeast gene and 
substituted for the endogenous gene by homologous recombination. However, this approach is 
probably the most useful for small RNAs and functionally characterized RNA motifs. It is also 
possible to identify other RNAs bound within TRAP-purified complexes. This can be achieved 
either by RT-PCR, or more globally by labeling the RNAs and hybridizing to microarrays. 

Given the rapidly growing number of important processes controlled by RNAs and the proteins 
that bind them, TRAP-tagging should prove to be a key tool in the elucidation of these functions 
on a genomic scale. Once well characterized, functional RNA elements can serve as drug targets 
(RNAi etc.). Viral RNAs such as those of HIV and hepatitis B, and the proteins that bind them, 
are particularly applicable targets. Examples of such uses include the treatment of viral 
infections, the control of cellular proliferation and the stimulation of neuronal regeneration. 

Vector construction 

The initial TRAP vectors were constructed using a cassette-based approach to allow for 
maximum versatility and ease in transferring into specialized transgenesis vectors. The initial 
in vitro expression vectors were constructed using the pSP72 cloning vector (Promega) as a 
backbone. This vector has a 5' T7 RNA polymerase site. The sequences for the S1 (SEQ ID NO: 1 
and SEQ ID NO: 2) and MS2 affinity tags were inserted using a pair of hybridized 
oligonucleotides. For the streptavidin aptamer the sequences were: 

S1 5' BglII 5'- ATCTAAAAGACCGACCAGAATCATGCAAGTGCGTAAGATAGTCGCGGGCCGCKjAAAA^ 
S1 3' BglII 3'- ATCTTTTTTCCCGGCCCGCGACTATCTTACGCAC^ 

that produce an S1 cassette (SEQ ID NO: 1) when inserted into the BglII site of pSP72 (see 
Figure 2). The MS2 aptamer was created from four linked oligonucleotides: 

MS2 #1 5' CAAACGACTCTAGAAAACATGAGGATCACCCATGTCTGCAGG 
MS2 #1 3' TXXSACCTGCAGACATGGGTGATCCTCATGTTTTCTAGAGTCGT^^ 
MS2 #2 5' TCXIACTCTAGAAACATGAGGATCACCCATGTCTGCAGGTCAAAAAGAGCT 
MS2 #2 3' CTTirrGACCTGCAGACATGGGTGAtCCn^ 

that form a cassette containing two MS2 hairpins (SEQ ID NO: 6) when cloned into the pSP72 
SacI site. This vector was then sequenced to ensure that the aptamer sequences were in the 
correct orientation. Other primers used to create the other tags tested included: 

5' Streptotag KpnI CAAAAGGATCGCATTTGGACnTCTGCCCAGGQTGGCACCATOTX^^ 
3' Streptotag KpnI CnTTTGGATCCGACCGTGGTGailACCCTGGGCAGAAGT 

which when hybridized and cloned into the KpnI site of pSP72 produce the Streptotag cassette 
(SEQ ID NO: 9), and 

N 5' KpnI GATCCTTTTCGGGTGAAAAAGGGCTTTTG 
N 3' KpnI GATCCAAAAGCCCTTTTTCAGGGCAAAa 

that when hybridized and cloned into the KpnI site of pSP72 produce the Nut cassette (SEQ ID 
NO: 12). A pre-made cassette, D8, encoding a Sephadex binding hairpin (SEQ ID NO: 15) was 
also tested (Srisawat et al., 2001). 

The subsequent vectors pTRAPS1, pTRAPMS2, pTRAPS1MS2, pTRAPN, pTRAPS1N, 
pTRAPS1D8, and pTRAPD8 all contain several sites for cloning bait sequences 
(pTRAPS1MS2 is shown in Figure 2). 

To create in vitro labeled RNA, the pTRAP vector was linearized by cutting to completion with 
XhoI. This cut DNA was treated with phenol and chloroform to remove the restriction enzyme. 
A 25 µl RNA transcription reaction contains: 1 µg of linearized pTRAP DNA, 5 µl of 5x T7 RNA 
polymerase buffer (400 mM Tris-HCl pH 8.0, 60 mM MgCl2), 5 µl 10 mM NTP mix, 1 µl 0.75 mM 
dithiothreitol (RNAse free), 20 U of placental RNAse inhibitor (MBI), 15 U of T7 RNA 
polymerase and RNAse free water to 25 µl. This reaction was incubated at 37°C for 2 hours and 
then the 

Purification of GST-CP beads 

A coat protein GST fusion protein was made by subcloning a PCR fragment consisting of the 
entire open reading frame of the coat protein gene (with a BamHI restriction site added 3' and 
XhoI added 5') into the pGEX4T vector (Pharmacia). The GST fusion protein was expressed in 
E. coli BL21 cells grown at 37°C for 3 hours (OD600 of 1.8) and then induced with 100 mM IPTG 
for 4.5 hours. Cells were pelleted in 250 ml aliquots, quick frozen in liquid nitrogen and stored 
for as long as 2 months at -70°C. Cell pellets were lysed by sonication (5 min at 50%) and bound 
to Glutathione-Sepharose beads (Pharmacia) following the manufacturer's directions. Purified 
beads can be stored in PBS for up to 1 month at 4°C or alternatively the purified GST-Coat 
Protein can be eluted using reduced Glutathione (Sigma). The purified protein is then 
concentrated in a centricon filter (Millipore) and finally re-constituted in 1x PBS with 10% 
glycerol. This solution can be stored at -70°C for more than six months. For use in TRAP 
purification, the purified protein can be re-bound to GST-sepharose (5 mg/ml) or alternatively, 
can be pre-bound to the RNA and then bound to the affinity matrix. 

Transgenic Lines 

To create transgenic Drosophila, the BglII-PvuII fragment of pTRAPS1MS2+WLE1, 
pTRAPS1MS2+WLE2 or pTRAPS1MS2 (no insert) was cloned into a BglII-StuI site within the 
transposable element based pCASPER-HS vector (Thummel and Pirrotta, 1992). These vectors, 
pCASPER-TRAP-WLE1, pCASPER-TRAP-WLE2 and pCASPER-TRAP, were then introduced 
into Drosophila embryos by microinjection (Spradling and Rubin, 1982). For each construct, at 
least three independent transgenic lines were isolated. 

Extract preparation 

TRAP Purification Buffer (TPB) was used for all steps of the purification including isolation of 
the extract (5x stock solution = 500 mM HEPES pH 7.4, 50 mM MgCl2, 400 mM NaCl, 0.5% 
Triton X-100). TPB working solution is made by diluting the 5x stock and adding proteinase 
inhibitor (Complete, EDTA free; Roche) and 0.3 mM DTT. Drosophila embryos were collected 
for 4 or 12 hours and aged an additional 4 hours. For each transgene, the level of transgenic RNA 
was determined empirically by semi-quantitative RT-PCR to allow for RNA expression that was 
similar in level to the endogenous transcript. For example, for TRAP constructs containing the 
WLE bait sequences, the transgenic embryos were induced using a 30 min heat pulse (36.5°C) and 
allowed to recover for 20 minutes. Following dechorionation, transfer embryos to a chilled 
dounce homogenizer. All further steps are at 4°C. Add enough TPB to cover embryos and 
homogenize using 10 strokes with a loose (A) and 10 strokes with a tight (B) pestle. Transfer 
homogenate to RNAse free 1.5 ml tubes and spin 10 minutes at 14000g. Transfer supernatant to a 
new tube and repeat until extract is clear (avoid lipid layer above the extract). Add an additional 
10% glycerol and freeze in liquid nitrogen. Lysates prepared in this way are approximately 
5 mg/ml protein and can be stored for up to 3 months at -70°C. 
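
The dilution of the 5x TPB stock to working strength is simple arithmetic, sketched below in Python. This is an illustration under stated assumptions: the function name `working_solution` and the 10 ml batch size are hypothetical, and the stock composition is taken to be 500 mM HEPES pH 7.4, 50 mM MgCl2, 400 mM NaCl and 0.5% Triton X-100. The proteinase inhibitor and 0.3 mM DTT are added separately and are not modelled.

```python
# Assumed composition of the 5x TPB stock (units are part of each key).
TPB_5X_STOCK = {
    "HEPES pH 7.4 (mM)": 500.0,
    "MgCl2 (mM)": 50.0,
    "NaCl (mM)": 400.0,
    "Triton X-100 (%)": 0.5,
}


def working_solution(stock, stock_fold, final_volume_ml):
    """Return (stock volume in ml, diluent volume in ml, working concentrations).

    stock: mapping of component name to concentration in the concentrated stock.
    stock_fold: concentration factor of the stock (5 for a 5x stock).
    final_volume_ml: volume of 1x working solution to prepare.
    """
    if stock_fold <= 0 or final_volume_ml <= 0:
        raise ValueError("fold and volume must be positive")
    stock_volume_ml = final_volume_ml / stock_fold
    diluent_volume_ml = final_volume_ml - stock_volume_ml
    concentrations = {name: value / stock_fold for name, value in stock.items()}
    return stock_volume_ml, diluent_volume_ml, concentrations


if __name__ == "__main__":
    stock_ml, diluent_ml, conc = working_solution(TPB_5X_STOCK, 5, 10.0)
    print(f"mix {stock_ml} ml of 5x stock with {diluent_ml} ml of RNAse free water")
    for name, value in conc.items():
        print(f"  {name}: {value:g}")
```

On these assumptions the 1x working buffer contains 100 mM HEPES, 10 mM MgCl2, 80 mM NaCl and 0.1% Triton X-100.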

TRAP purification 

RNAse free conditions and solutions made with DEPC treated water were used throughout. 
Biotin-related proteins were first removed from the extract by mixing the extract with Avidin 
agarose beads (Sigma). For each 500 µl of extract, 100 µl settled volume of Avidin beads is 
washed 3 times in 800 µl TPB and then incubated with thawed lysate for 10 minutes at 4°C. The 
avidin beads are then removed by passing the mixture over an RNAse free mini chromatography 
column (Bio-Rad). Eluates are collected and mixed with 50 µl/ml extract of pre-equilibrated 
streptavidin agarose beads (Sigma). After gentle rocking for 1 h at 4°C, the mixture is added to a 
plugged mini column and allowed to settle. After elution, columns are washed three times with 
800 µl TPB. The bound RNA/protein complexes are then eluted by the addition of 250 µl of 
Biotin elution buffer (1x TPB + 5 mM Biotin, Sigma). Incubate for 1 hr at 4°C. Collect the 
eluate, wash the column once with an additional 250 µl Biotin elution buffer and pool the wash 
with the first eluate. An option at this point is to repurify the eluate over a second streptavidin 
column by removing the biotin (using Avidin-agarose beads as described above) and repeating 
the procedure. 

The streptavidin eluate is then bound to the GST-CP beads (described above). Equilibrate 100 µl 
of a 50% slurry of GST-CP sepharose beads per 500 µl of streptavidin eluate in 1x TPB. Add the 
streptavidin column eluate and rock for 1 h at 4°C. Pour into a plugged mini column, let beads 
settle and then let lysate flow through. Wash three times with 500 µl of 1x TPB. As above, a 
Bio-Rad or Pierce protein assay can be used to determine total number of washes needed. 
The bound RNA and proteins can be eluted using glutathione, high salt, RNAse or various 
denaturants (e.g. urea, SDS). The column is capped and one bed volume of elution buffer added. 
For glutathione and RNAse elutions, the mixture is rocked for 1 hr at 4°C. The glutathione 
elution buffer is as described by Pharmacia, and RNAse elution buffer includes 200 µl of 
2 mg/ml RNAse A and 5000 U/ml RNAse T1 (Fermentas) in 1 ml of RNAse buffer (10 mM Tris-HCl 
pH 7.5 and 10 mM MgCl2). Wash the column three times with an equal amount of appropriate 
buffer and pool the eluates. Proteins are then resolved by SDS-PAGE and identified by trypsin 
proteolysis, mass spectrometry (Fenyo et al., 1998) and submission of the data to genomic 
databases. 

SEQUENCE LISTING 

<110> Krause, Henry 

Simmonds, Andrew 

<120> TRAP-Tagging: a novel method for the identification and 
purification of RNA-protein complexes 

<130> 3110 0023 

<160> 16 

<170> PatentIn version 3.0 

<210> 1 
<211> 68 
<212> DNA 

<213> synthetic construct 
<400> 1 

atcgataaaa agaccgacca gaatcatgca agtgcgtaag atagtcgcgg gccgggaaaa 60 
aaatcgat 68 

<210> 2 
<211> 45 
<212> DNA 

<213> synthetic construct 
<400> 2 

gaccgaccag aatcatgcaa gtgcgtaaga tagtcgcggg ccggg 45 






<210> 3 

<211> 68 

<212> RNA 

<213> synthetic construct 



<400> 3 

aucgauaaaa agaccgacca gaaucaugca agugcguaag auagucgcgg gccgggaaaa 
aaaucgau 



<210> 4 

<211> 38 

<212> DNA 

<213> synthetic construct 



<400> 4 

gactctagaa acatgaggat cacccatgtc tgcaggtc 



<210> 5 

<211> 38 

<212> RNA 

<213> synthetic construct 

<400> 5 

gacucuagaa acaugaggau cacccauguc ugcagguc 

<210> 6 

<211> 96 

<212> DNA 

<213> synthetic construct 



<400> 6 

gagctcaaaa acgactctag aaacatgagg atcacccatg tctgcaggtc gactctagaa 60 
acatgaggat accatgtctg caggtcaaaa gagctc 96 

<210> 7 

<211> 75 

<212> DNA 

<213> synthetic construct 



<400> 7 

cgactctaga aacatgagga tcacccatgt ctgcaggtcg actctagaaa catgaggata 60 
ccatgtctgc aggtc 75 

<210> 8 

<211> 96 

<212> RNA 

<213> synthetic construct 



<400> 8 

gagcucaaaa acgacucuag aaacaugagg aucacccaug ucugcagguc gacucuagaa 60 
acaugaggau accaugucug caggucaaaa gagcuc 96 

<210> 9 

<211> 64 

<212> DNA 

<213> synthetic construct 



<400> 9 

gtaccaaaag gatcgcattt ggacttctgc ccagggtggc accacgtgcg gatccaaaag 60 
gtac 64 

<210> 10 
<211> 46 
<212> DNA 

<213> synthetic construct 

<400> 10 

ggatcgcatt tggacttctg cccagggtgg caccacgtgc ggatcc 46 



<210> 11 

<211> 64 

<212> RNA 

<213> synthetic construct 

<400> 11 

guaccaaaag gaucgcauuu ggacuucugc ccaggguggc accacgugcg gauccaaaag 60 
guac 64 

<210> 12 

<211> 33 

<212> DNA 

<213> synthetic construct 

<400> 12 

gatccttttc gggtgaaaaa gggcttttgg tac 33 

<210> 13 
<211> 15 
<212> DNA 

<213> synthetic construct 
<400> 13 

cgggtgaaaa agggc 15 



<210> 14 

<211> 33 

<212> RNA 

<213> synthetic construct 



<400> 14 

gauccuuuuc gggugaaaaa gggcuuuugg uac 



<210> 15 

<211> 64 

<212> DNA 

<213> synthetic construct 



<400> 15 

ccgaccagaa gtccgagtaa tttacgtttt gatacggttg cggaacttgc tatgtgcgtc 
taca 

<210> 16 

<211> 64 

<212> RNA 

<213> synthetic construct 



<400> 16 

ccgaccagaa guccgaguaa uuuacguuuu gauacgguug cggaacuugc uaugugcguc 
uaca 
We claim: 

1. An isolated DNA construct comprising a transcription cassette, which construct 
comprises: 

(a) a promoter sequence; 

(b) a bait sequence operably linked to the promoter; 

(c) a transcriptional termination sequence, which comprises a stop signal for RNA 
polymerase and a polyadenylation signal for polyadenylase; and 

(d) at least two tag sequences. 

2. The isolated DNA construct of claim 1 further comprising at least three insulator 
sequences. 

3. The isolated DNA construct of claim 1 or 2 wherein the tag sequences comprise at least 
one streptavidin binding sequence [SEQ ID NO:2] and at least one MS2 coat protein binding 
sequence [SEQ ID NO:4]. 

4. The isolated DNA construct of claim 1 or 2 wherein the tag sequences comprise at least 
one sequence which hybridizes to streptavidin binding sequence [SEQ ID NO:2] and at least one 
sequence which hybridizes to MS2 coat protein sequence [SEQ ID NO:4] under high stringency 
hybridization conditions. 

5. An isolated DNA construct comprising a transcription cassette, which construct 
comprises: 

(a) a promoter sequence; 

(b) a bait sequence operably linked to the promoter; 

(c) a transcriptional termination sequence, which comprises a stop signal for RNA 
polymerase and a polyadenylation signal for polyadenylase; and 

(d) at least three tag sequences. 

6. The isolated DNA construct of claim 5 further comprising at least four insulator
sequences. 

7. The isolated DNA construct of claim 5 or 6 wherein the tag sequences comprise at least
one streptavidin binding sequence [SEQ ID NO:2] and at least two MS2 coat protein binding 
sequences [SEQ ID NO:7]. 

8. The isolated DNA construct of claim 5 or 6 wherein the tag sequences comprise at least 
one sequence which hybridizes to streptavidin binding sequence [SEQ ID NO:2] and at least two 
sequences which hybridize to MS2 coat protein sequence [SEQ ID NO:7] under high stringency
hybridization conditions. 

9. A vector comprising the isolated DNA construct of claim 1, 2, 3, 4, 5, 6, 7, or 8.

10. A host cell transformed with the vector of claim 9. 

11. An RNA fusion molecule comprising: 

(a) a target RNA sequence; and 

(b) at least two RNA tags, wherein at least one of the RNA tags interacts with a ligand in a 
reversible fashion. 

12. The RNA fusion molecule of claim 11 further comprising at least three insulators.

13. The RNA fusion molecule of claim 11 or 12 wherein the RNA tags comprise at least one
streptavidin binding tag [SEQ ID NO:3] and at least one MS2 coat protein binding tag [SEQ ID
NO:5].

14. An RNA fusion molecule comprising:

(a) a target RNA sequence; and 

(b) at least three RNA tags, wherein at least two of the RNA tags interact with a ligand in a
reversible fashion. 

15. The RNA fusion molecule of claim 14 further comprising at least 4 insulators.

16. The RNA fusion molecule of claim 14 or 15 wherein the RNA tags comprise at least one
streptavidin binding tag [SEQ ID NO:2] and at least two MS2 coat protein binding tags [SEQ ID
NO:7].

17. A method for isolating an RNA-protein complex formed in vivo comprising: 

(a) expressing in a eukaryotic cell the RNA fusion molecule of claim 11, 12, 13, 14, 15, or
16;

(b) generating a whole cell extract; 

(c) passing the extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein; 

(g) eluting a second eluate with the addition of a reagent selected from the group consisting
of glutathione, RNAse or a denaturant; and 

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein 
complex. 

18. A method of identifying a protein in an RNA-protein complex comprising the method of
isolating an RNA-protein complex according to the method of claim 17 and identifying the
protein in the RNA-protein complex.

19. A protein identified by the method of claim 18. 

20. A method for isolating an RNA-protein complex formed in vitro comprising:

(a) expressing the RNA fusion molecule of claim 11, 12, 13, 14, 15, or 16 in vitro;

(b) obtaining a whole cell extract; 

(c) passing the whole cell extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein;

(g) eluting a second eluate with the addition of a reagent selected from the group consisting 
of glutathione, RNAse or a denaturant; and

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein
complex. 

21. The method of claim 20, wherein the steps (c) to (e) are repeated.

22. A method of identifying a protein in an RNA-protein complex comprising the method of 
isolating an RNA-protein complex according to the method of claim 20 and identifying the 
protein in the RNA-protein complex. 

23. A protein identified by the method of claim 22.

24. A method of screening for a compound that modulates the formation of an RNA-protein 
complex formed in vivo comprising: 

(a) expressing in a eukaryotic cell the RNA fusion molecule of claim 11, 12, 13, 14, 15 or
16, in the presence of a test compound;

(b) generating a whole cell extract; 

(c) passing the extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein; 

(g) eluting a second eluate with the addition of a reagent selected from the group consisting
of glutathione, RNAse or a denaturant; 

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein
complex; 

(i) measuring the amount of isolated RNA-protein complex present; and 

(j) comparing the amount of isolated RNA-protein complex present in the absence of the
compound to be tested. 






25. A method of screening for a compound that modulates the formation of an RNA-protein 
complex formed in vitro comprising: 

(a) expressing the RNA fusion molecule of claim 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 in
vitro;

(b) obtaining a whole cell extract; 

(c) passing the whole cell extract over a first solid support comprising streptavidin protein; 

(d) eluting a first eluate with the addition of biotin; 

(e) collecting the first eluate; 

(f) passing the first eluate over a second solid support comprising MS2 coat protein; 

(g) eluting a second eluate with the addition of a reagent selected from the group consisting
of glutathione, RNAse or a denaturant; 

(h) collecting the second eluate, wherein the second eluate contains the isolated RNA-protein
complex; 

(i) measuring the amount of isolated RNA-protein complex present; and 

(j) comparing the amount of isolated RNA-protein complex present in the absence of the 
compound to be tested. 

26. The method of claim 25, wherein steps (c) to (e) are repeated. 

27. A kit for detecting RNA-protein complexes comprising the isolated DNA construct of 
claim 1, 2, 3, 4, 5, 6, 7, or 8.

28. A kit for detecting RNA-protein complexes comprising the vector of claim 9.
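
Steps (i) and (j) of claims 24 and 25 call for measuring the amount of isolated RNA-protein complex and comparing it with the amount obtained in the absence of the test compound. The following is a minimal sketch of one way such a comparison could be expressed. It is an illustration only: the function names, the 20% tolerance and the example values in arbitrary units are all hypothetical and are not taken from the claims or the description.

```python
def fold_change(amount_with_compound, amount_without_compound):
    """Ratio of complex recovered with the test compound to complex recovered without it."""
    if amount_without_compound <= 0:
        raise ValueError("the amount measured without compound must be positive")
    return amount_with_compound / amount_without_compound


def classify(ratio, tolerance=0.2):
    """Label the effect of the compound; the tolerance is a hypothetical cutoff."""
    if ratio > 1 + tolerance:
        return "increases complex formation"
    if ratio < 1 - tolerance:
        return "decreases complex formation"
    return "no clear effect"


# Hypothetical measurements in arbitrary units (for example, band intensities).
with_compound = 35.0
without_compound = 100.0

ratio = fold_change(with_compound, without_compound)
print("ratio:", ratio, "->", classify(ratio))
```

In practice the cutoff would have to be chosen from replicate measurements rather than fixed in advance.
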